Session Segmentation Based on Document Metadata

Author

  • Tomáš Kramár
Abstract

Search personalization has been shown to benefit greatly from exploiting the user's short-term context: their immediate needs and focus. To achieve that, we need to be able to tell when the context changes; that is, we need to divide the user's activity into segments, where each segment captures a single goal and focus of the user. Many different approaches exist, but their major weakness is that they build inaccurate models that do not include the user's implicit feedback. We present a method for segmenting queries into search sessions that is based on document metadata and incorporates implicit feedback, and as such is able to build a more accurate context model.
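
The abstract gives no algorithmic detail, so the following is only an illustrative sketch of the general idea, not the author's method. It assumes that each query carries the keyword metadata of the documents the user clicked, that dwell time stands in for implicit feedback, and that a new session starts when the Jaccard overlap with the running context falls below a threshold. The names `Query`, `context_of`, `segment`, `threshold`, and `min_dwell` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Query:
    text: str
    # Assumed input shape: (keywords of a clicked document, dwell time in seconds).
    clicks: list[tuple[set[str], float]] = field(default_factory=list)


def context_of(query: Query, min_dwell: float = 10.0) -> set[str]:
    """Query terms plus keywords of documents the user dwelt on long enough."""
    terms = set(query.text.lower().split())
    for keywords, dwell in query.clicks:
        # Implicit feedback (assumption): short visits are treated as irrelevant.
        if dwell >= min_dwell:
            terms |= {k.lower() for k in keywords}
    return terms


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two term sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def segment(queries: list[Query], threshold: float = 0.2,
            min_dwell: float = 10.0) -> list[list[Query]]:
    """Greedily split a chronological query log into sessions."""
    sessions: list[list[Query]] = []
    context: set[str] = set()
    for q in queries:
        terms = context_of(q, min_dwell)
        if sessions and jaccard(context, terms) >= threshold:
            sessions[-1].append(q)   # same goal: extend the current session
            context |= terms
        else:
            sessions.append([q])     # context changed: start a new session
            context = set(terms)
    return sessions


log = [
    Query("jaguar speed", [({"jaguar", "animal", "cat"}, 45.0)]),
    Query("big cat habitat", [({"cat", "animal", "habitat"}, 30.0)]),
    Query("python list sort", [({"python", "programming"}, 60.0)]),
]
print([len(s) for s in segment(log)])  # [2, 1]
```

In this toy log, the second query shares no words with the first; it joins the same session only because the metadata of the clicked documents overlaps. The threshold and the dwell-time cutoff are arbitrary illustrative values.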

Similar resources

Document Analysis And Classification Based On Passing Window

In this paper we present a document analysis and classification system to segment and classify the contents of Arabic document images. The system includes preprocessing, document segmentation, feature extraction, and document classification. In preprocessing, a document image is enhanced by removing noise, binarizing, and detecting and correcting image skew. In document segmentation, an algorith...

Detecting Search Sessions Using Document Metadata and Implicit Feedback

It has been shown that search personalization can greatly benefit from exploiting the user's short-term context: the user's immediate need and intent. However, this requires that the search engine be able to divide the user's activity into segments, where each segment captures a single goal and focus of the user. Several different approaches to search session segmentation exist, each considering different f...

Metadata extraction and text categorization using Universal Resource Locator expansions

Uniform resource locators (URLs), which mark the address of a resource on the World Wide Web, are often human-readable and can indicate metadata about a resource. This paper explores the mining of URLs to yield categoric metadata about web resources via a three-phase pipeline of word segmentation, abbreviation expansion and classification. I apply this approach to the problem of subject metadat...

Persian Printed Document Analysis and Page Segmentation

This paper presents a hybrid method, combining low-resolution and high-resolution analysis, for Persian page segmentation. In the low-resolution page segmentation, a pyramidal image structure is constructed for multiscale analysis, and the document image is segmented into a set of regions. In the high-resolution page segmentation, connected components analysis is used to segment each region into homogeneous regions and identifyi...

Publication date: 2011